Using Noise to Infer Aspects of Simplicity Without Learning

Neural Information Processing Systems

Noise in data significantly influences decision-making in the data science process. In fact, it has been shown that noise in data generation processes leads practitioners to find simpler models. However, an open question remains: what degree of model simplification can we expect under different noise levels? In this work, we address this question by investigating the relationship between the amount of noise and model simplicity across various hypothesis spaces, focusing on decision trees and linear models. We formally show that noise acts as an implicit regularizer for several different noise models. Furthermore, we prove that Rashomon sets (sets of near-optimal models) constructed with noisy data tend to contain simpler models than corresponding Rashomon sets with non-noisy data. Additionally, we show that noise expands the set of "good" features and consequently enlarges the set of models that use at least one good feature. Our work offers theoretical guarantees and practical insights for practitioners and policymakers on whether simple-yet-accurate machine learning models are likely to exist, based on knowledge of noise levels in the data generation process.
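The claim that noise acts as an implicit regularizer has a classical concrete instance for linear models: minimizing squared loss on inputs corrupted with zero-mean noise is, in expectation, equivalent to minimizing a ridge (L2-penalized) loss on clean inputs. A minimal pure-Python sketch of that identity (the data, weights, and noise level below are illustrative, not from the paper):

```python
import random

random.seed(0)

# Tiny fixed regression problem: n = 3 samples, d = 2 features (illustrative).
X = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.5]]
y = [1.0, 0.0, 2.0]
w = [0.3, -0.7]   # a fixed candidate linear model
sigma = 0.5       # std-dev of the additive feature noise

def sq_loss(X, y, w):
    return sum((yi - sum(xij * wj for xij, wj in zip(xi, w))) ** 2
               for xi, yi in zip(X, y))

# Monte-Carlo estimate of the expected loss when features are noisy.
draws = 20000
total = 0.0
for _ in range(draws):
    Xn = [[xij + random.gauss(0.0, sigma) for xij in xi] for xi in X]
    total += sq_loss(Xn, y, w)
noisy_loss = total / draws

# Identity: E[loss under noise] = clean loss + n * sigma^2 * ||w||^2,
# i.e. the noise contributes an implicit L2 (ridge) penalty on w.
ridge_loss = sq_loss(X, y, w) + len(X) * sigma ** 2 * sum(wj ** 2 for wj in w)

print(abs(noisy_loss - ridge_loss) / ridge_loss)  # close to 0
```

Since the penalty grows with sigma, the loss-minimizing w shrinks toward zero as noise increases, which is one formal sense in which noisier data favors simpler (smaller-norm) models.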



We thank the reviewers for the thoughtful comments and attempt to address their questions, space permitting

Neural Information Processing Systems

We thank the reviewers for the thoughtful comments and attempt to address their questions, space permitting. We acknowledge the comment regarding the magnitudes of the presented effects. The accuracies we observe are on par with other reported single-trial MEG accuracies [36]. We will incorporate this in the discussion section. Such a model may exhibit less catastrophic forgetting when learning new tasks.



Low-Resolution Neural Networks

Cabral, Eduardo Lobo Lustosa, Driemeier, Larissa

arXiv.org Artificial Intelligence

The expanding scale of large neural network models introduces significant challenges, driving efforts to reduce memory usage and enhance computational efficiency. Such measures are crucial to ensure the practical implementation and effective application of these sophisticated models across a wide array of use cases. This study examines the impact of parameter bit precision on model performance compared to standard 32-bit models, with a focus on multiclass object classification in images. The models analyzed include those with fully connected layers, convolutional layers, and transformer blocks, with model weight resolution ranging from 1 bit to 4.08 bits. The findings indicate that models with lower parameter bit precision achieve results comparable to 32-bit models, showing promise for use in memory-constrained devices. While low-resolution models with a small number of parameters require more training epochs to achieve accuracy comparable to 32-bit models, those with a large number of parameters achieve similar performance within the same number of epochs. Additionally, data augmentation can destabilize training in low-resolution models, but including zero as a potential value in the weight parameters helps maintain stability and prevents performance degradation. Overall, 2.32-bit weights offer the optimal balance of memory reduction, performance, and efficiency. However, further research should explore other dataset types and more complex and larger models. These findings suggest a potential new era for optimized neural network models with reduced memory requirements and improved computational efficiency, though advancements in dedicated hardware are necessary to fully realize this potential.
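The fractional bit widths presumably refer to the base-2 logarithm of the number of representable weight values (e.g. log2(5) ≈ 2.32 bits for five levels), which is how an odd level count can include zero exactly, as the abstract emphasizes. Both that reading and the helper below are assumptions for illustration, not the paper's actual quantization scheme:

```python
import math

def quantize_symmetric(weights, levels):
    """Map each weight to the nearest of `levels` uniformly spaced values
    in [-max|w|, +max|w|]; an odd `levels` count represents zero exactly."""
    w_max = max(abs(w) for w in weights) or 1.0
    step = w_max / ((levels - 1) / 2)
    return [round(w / step) * step for w in weights]

w = [0.31, -0.02, 0.77, -0.55, 0.0]
q5 = quantize_symmetric(w, levels=5)   # log2(5) ~ 2.32 "bits" per weight
print(q5)                              # small weights snap to exactly zero
print(math.log2(5))
```

With five levels the grid is {-2, -1, 0, 1, 2} times the step size, so near-zero weights are stored as zero rather than being forced to a nonzero value, consistent with the stability benefit the abstract reports.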


Double and Single Descent in Causal Inference with an Application to High-Dimensional Synthetic Control

Spiess, Jann, Imbens, Guido, Venugopal, Amar

arXiv.org Machine Learning

Motivated by a recent literature on the double-descent phenomenon in machine learning, we consider highly over-parameterized models in causal inference, including synthetic control with many control units. In such models, there may be so many free parameters that the model fits the training data perfectly. We first investigate high-dimensional linear regression for imputing wage data and estimating average treatment effects, where we find that models with many more covariates than sample size can outperform simple ones. We then document the performance of high-dimensional synthetic control estimators with many control units. We find that adding control units can help improve imputation performance even beyond the point where the pre-treatment fit is perfect. We provide a unified theoretical perspective on the performance of these high-dimensional models. Specifically, we show that more complex models can be interpreted as model-averaging estimators over simpler ones, which we link to an improvement in average performance. This perspective yields concrete insights into the use of synthetic control when control units are many relative to the number of pre-treatment periods.
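When free parameters outnumber samples, "fits the training data perfectly" usually means the minimum-norm interpolating solution, which is what a pseudoinverse computes and what double-descent analyses typically study. A toy sketch with one sample and two features (values chosen purely for illustration):

```python
# Underdetermined least squares: one training sample, two features,
# so x . w = y has infinitely many exact solutions (perfect training fit).
x = [3.0, 4.0]
y = 2.0

# Minimum-norm interpolator: w = y * x / ||x||^2.
nx2 = sum(v * v for v in x)
w_min = [y * v / nx2 for v in x]

# Any other interpolator adds a component orthogonal to x ...
alt = [w_min[0] - 0.5 * x[1], w_min[1] + 0.5 * x[0]]

fit_min = sum(wv * xv for wv, xv in zip(w_min, x))
fit_alt = sum(av * xv for av, xv in zip(alt, x))
print(abs(fit_min - y) < 1e-12, abs(fit_alt - y) < 1e-12)  # both interpolate
# ... but w_min has the strictly smaller norm, which is what controls
# out-of-sample behavior in the over-parameterized regime.
print(sum(v * v for v in w_min) < sum(v * v for v in alt))  # True
```

Among all parameter vectors that fit the training data exactly, the pseudoinverse picks the one with smallest norm, which is why adding covariates (or control units) can keep improving imputation even after the pre-treatment fit is already perfect.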


Why Simple Models Are Often Better

#artificialintelligence

In data science and machine learning, simplicity is an important concept that can have a significant impact on model characteristics such as performance and interpretability. Over-engineered solutions tend to adversely affect these characteristics by increasing the likelihood of overfitting, decreasing computational efficiency, and lowering the transparency of the model's output. The latter is particularly important for areas that require a certain degree of interpretability, such as medicine and healthcare, finance, or law. The inability to interpret and trust a model's decision -- and to ensure that this decision is fair and unbiased -- can have serious consequences for individuals whose fate depends on it. This article aims to highlight the importance of giving precedence to simplicity when implementing a data science or machine learning solution.


Mistakes To Avoid as an AI Practitioner in Industry

#artificialintelligence

She discusses the importance of knowing when AI is actually the appropriate solution, the value of domain expertise on a project, and other key factors in successful AI applications. I'm going to tell you which mistakes to avoid if you want to be an AI practitioner in industry, especially if you are coming from an academic mindset. Around 90% of the machine learning models we build in companies and research labs never make it to production: only one in ten data scientists' AI solutions ends up being part of a product, while the other nine are discarded, discontinued, or forced to pivot. I will highlight twelve mistakes that are crucial to avoid if you want to successfully deploy an AI-based solution to production.